For the next steps, I would like to switch to the tidyverse. The tidyverse is an R package. In fact, it is a collection of packages that come with many handy functions for data wrangling and visualisation. It is perfectly normal to use various R packages for additional functionality, and you will “collect” a lot of them over time. However, the packages of the tidyverse are a little bit special. The tidyverse has its own “coding style”, which works slightly differently from what we’ve learned so far. This has led to a whole war between “base R” and “the tidyverse”, with many hardliners on both sides. Sometimes, the tidyverse is described as being especially beginner friendly and, well … tidy. Personally, I don’t know what all the fuss is about: Both base R and the tidyverse have their pros and cons and honestly, I think they work best when used together. I use whatever I think works best for the task at hand - it’s not “either or”. In this course, I have to make decisions about what to teach you, and I will mainly show you the tidyverse way of doing things, even though I’m a huge fan of base R. I can either teach you to do one thing in 50 different ways, or teach you how to do 50 different things, so I’m going with the latter. If you are interested, I can share materials with you that teach you how to do things “the base way”.
Let’s install the tidyverse together. You can install any package using the function install.packages(). It takes the package name as an argument. If you start typing the package name without the quotation marks, R will start suggesting packages to you which are directly available on CRAN (this is also where we installed R from).
# install the tidyverse
# install.packages("tidyverse")
However, just because you have a package, that does not mean you can use it. Packages only become “active” if you load them with the function library(). This seems a bit effortful, but there are good reasons for it: Because there are so many packages, a lot of them use the same names for different functions. One function in one package can do an entirely different thing than the “same” function in a different package! So, if all packages were loaded at once, it would be difficult to avoid confusion. Usually, you want to make a conscious decision about which packages to load in your script. Once a package is loaded, it stays loaded throughout the entire R session. Let’s load the tidyverse (see footnote 1).
# load the tidyverse
library(tidyverse)
One of the most popular and noticeable features of the tidyverse is the pipe operator, which looks like this: %>% (see footnote 2). It is used to create chains of functions, which means that you put functions after each other, linking them with the pipe operator. In base R, you achieve the same thing with a more “onion-like approach”. Let me show you what I mean by this. Say we have this vector:
example_vector <- c(1, 3, 33, 50, 5, 1, 2)
We want to take the mean of this vector (using the mean() function), and then round the result to two decimal places (using the function round()). In base R, we would nest the functions inside each other (see footnote 3). They are then executed from the inside towards the outside.
# take the mean of the example vector and round the result to two decimal places (base R style)
round(mean(example_vector), 2)
## [1] 13.57
We could also do the same thing stepwise (and sometimes this is the best approach if you want your code to be readable).
# same as above, but in two steps
mean_example_vector <- mean(example_vector)
round(mean_example_vector, 2)
## [1] 13.57
In the tidyverse, we chain those functions together using the pipe operator:
# this time: tidy style!
mean(example_vector) %>% round(2)
## [1] 13.57
example_vector %>% mean() %>% round(2)
## [1] 13.57
Note how we can leave out the actual data argument every time. mean() doesn’t use any argument at all. That is because the pipe “delivers” example_vector to the mean() function. That means, mean() does get an argument, only that it’s not coming from within the brackets, but from the pipe to its left. We see that round() still has the number 2 as its second argument, indicating that we want to round to two decimal places. However, the first argument has been omitted. That’s again because the data - the mean of example_vector in this case - is coming from the left. Let’s practice chaining some functions together.
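Before we practice, here is a minimal sketch of what the pipe does with its left-hand side. It only assumes that the magrittr package (the home of %>%) is installed. The last two lines use the dot placeholder, a magrittr feature that is not covered in the text above, to send the piped value to an argument other than the first one. The names nested, piped and digits_piped are made up for this illustration.

```r
library(magrittr)  # provides %>%; the tidyverse packages re-export it

example_vector <- c(1, 3, 33, 50, 5, 1, 2)

# nested ("onion") version: evaluated from the inside out
nested <- round(mean(example_vector), 2)

# piped version: the left-hand side becomes the first argument of the next call
piped <- example_vector %>% mean() %>% round(2)

# both spellings give the same result
identical(nested, piped)

# the dot marks where the piped value should go instead of the first position
digits_piped <- 2 %>% round(mean(example_vector), digits = .)
identical(nested, digits_piped)
```

So x %>% f(y) is just another way of writing f(x, y), which is why the first argument seems to be missing in the piped calls.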
1.1) Turn this “function onion” into a tidyverse chain:
(Can you find out what the function abs() does?)
another_vector <- c(3, -2, 5, 99, -132.5)
# use abs on the vector, then take the mean and transform the result into a character
as.character(mean(abs(another_vector)))
## [1] "48.3"
another_vector %>% abs() %>% mean() %>% as.character()
## [1] "48.3"
1.2) Let’s make things a bit trickier with additional arguments.
vector_with_NAs <- c(3, -2, 5, NA, 99, -132.5, NA)
# use abs on the vector, show the first 4 elements with head, take the mean (with na.rm = TRUE)
# and turn into character
as.character(mean(head(abs(vector_with_NAs), 4), na.rm = TRUE))
## [1] "3.33333333333333"
vector_with_NAs %>%
  abs() %>%
  head(4) %>%
  mean(na.rm = TRUE) %>%
  as.character()
## [1] "3.33333333333333"
1.3) How do we get this into a pipe?
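mean(another_vector * 2)
## [1] -11
This does not work, because %>% binds more tightly than *, so only the 2 is piped into mean():
another_vector * 2 %>% mean()
## [1]    6   -4   10  198 -265
Wrapping the multiplication in brackets fixes this:
(another_vector * 2) %>% mean()
## [1] -11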
The pipes are all fun and games for vectors, but they unlock their true potential when applied to data frames. Furthermore, the tidyverse offers a lot of functionalities for “data wrangling”, i.e. cleaning, formatting and processing data. Let’s load in the nerd data again.
nerd <- read.csv("./data_sets/nerd_data_short.csv", sep = "\t")
The data contains a lot of information we don’t want to look at for now. How about selecting only the columns we want to work with right now? This is what the select() function does. We pipe our nerd data into it and then select the columns we want to keep. I want to keep the participants’ age, the gender, whether they’re married or not, and all of the “nerdy” questions, i.e. Q1 - Q26.
# select age, gender, married, and Q1 - Q26
nerd %>%
  select(age, gender, married, Q1:Q26)
nerd %>%
  select(age, gender, married, starts_with("Q"))
# the base R way
nerd[c("age", "gender", "married", paste0("Q", 1:26))]
A few remarkable things just happened. First of all, look at how I’m addressing the columns: I’m just using their names, without writing the name of the data frame again. That is, I’m writing age instead of nerd$age or nerd["age"]. That is not self-evident at all; it is a feature of the tidyverse’s “tidy evaluation”. select() uses a flavour of it called “tidy selection”, while functions such as filter() and mutate(), which we will meet below, use “data masking”. In both cases, you can use the columns of your data frame as if they were variables in your environment. You don’t have to refer to the data source (nerd, in this case) in order to access the columns. The tidyverse “knows” what you are referring to because you piped the main data source into the select() function.
And then there’s the fact that we don’t have to type out all of the columns Q1 - Q26 - we can actually just write exactly that: Q1:Q26 means column Q1 to Q26. Also note how the columns are now displayed in the order we typed them. Had we chosen a different order, our data would have been ordered differently as well.
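To make the column shortcuts a bit more tangible, here is a small sketch on a made-up data frame called toy (it is not the nerd data, and the values are invented). It assumes the dplyr package is installed, which is part of the tidyverse.

```r
library(dplyr)

# toy data standing in for the nerd data (made-up values)
toy <- data.frame(
  id  = 1:3,
  Q1  = c(1, 2, 3),
  Q2  = c(4, 5, 6),
  Q3  = c(5, 3, 1),
  age = c(20, 31, 45)
)

# Q1:Q3 means "all columns from Q1 to Q3"; age comes first because we typed it first
range_selected <- toy %>% select(age, Q1:Q3)
names(range_selected)

# starts_with("Q") picks every column whose name begins with Q
helper_selected <- toy %>% select(age, starts_with("Q"))
names(helper_selected)
```

Both calls keep the same four columns here. Keep in mind that starts_with("Q") would also pick up any other column whose name begins with Q, so the range notation is the more explicit of the two.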
Sometimes, you want to bring one column to the front, but you don’t really want to change anything else. You don’t want to name all the other columns, so there’s a shortcut for that: everything(). It literally throws in all the remaining columns you haven’t selected yet and is great for reordering. Say that, for some reason, we wanted the column voted (whether or not someone voted in the past election) to go first, then the age, and then all the rest. The corresponding code would look like this:
# select voted, age and then the rest of the data frame
nerd %>%
  select(voted, age, everything())
Don’t forget the () behind everything!
1.4) I want to continue working with a reduced version of our data that only contains the columns age, gender, married, voted, nerdy and also Q1 - Q26. Can you select these columns and save this reduced data frame in a variable called nerd_red (“nerd reduced”)?
nerd_red <-
nerd %>%
  select(age, gender, married, voted, nerdy, Q1:Q26)
There are further options for selecting subsets of data. We just made a selection based on columns, but we can also select only certain rows. This can be achieved with the filter() function. For example, let’s keep only the female participants, who are coded as 2 in the gender column.
# filter participants where the gender is 2
nerd_red %>%
  filter(gender == 2)
# the base R way
nerd_red[nerd_red$gender == 2, ]
We use the same notation as for logical comparisons, so make sure to use == instead of =! Here, we filter for women below the age of 30.
# filter for women below the age of 30 - use &
nerd_red %>%
  filter(gender == 2 & age < 30)
Interestingly, the logical & can be replaced with a comma in filter().
# filter for women below the age of 30 - use ,
nerd_red %>%
  filter(gender == 2, age < 30)
Filter nerd_red. Feel free to look at the code you wrote earlier in the course for the logical comparisons. Choose …
2.1) … anyone who is either 22 or 26 years old. There are several ways to achieve this!
nerd_red %>%
  filter(age == 22 | age == 26)
nerd_red %>%
  filter(age %in% c(22, 26))
c("tea", "biscuits", "cake") %in% "tea"
## [1]  TRUE FALSE FALSE
"tea" %in% c("tea", "biscuits", "cake")
## [1] TRUE
2.2) … anyone who is either a man (gender == 1) or older than 30.
nerd_red %>%
  filter(gender == 1 | age > 30)
2.3) … anyone who is non-binary (gender == 3) and who is currently married or has previously been (1 = Never married, 2 = Currently married, 3 = Previously married).
nerd_red %>%
  filter(gender == 3 & married != 1)
nerd_red %>%
  filter(gender == 3, !married == 1)
nerd_red %>%
  filter(gender == 3 & married == 2 | gender == 3 & married == 3)
nerd_red %>%
  filter(gender == 3, married %in% 2:3)
2.4) The column nerdy contains the agreement to the statement “I see myself as nerdy” (1 = disagree strongly - 7 = agree strongly). How many women have the highest self-reported nerdiness? How many have the lowest self-reported nerdiness? What about the men? Solve this task using filters, but also try table().
nerd_red %>%
  filter(gender == 2, nerdy == 7) %>%
  nrow()
## [1] 135
nerd_red %>%
  filter(gender == 2, nerdy == 1) %>%
  nrow()
## [1] 12
table(nerd_red$nerdy)
##
##   0   1   2   3   4   5   6   7
##   1  20  31  36 126 234 311 241
table(nerd_red$nerdy, nerd_red$gender)
##
##       1   2   3
##   0   1   0   0
##   1   8  12   0
##   2  18  12   1
##   3  14  21   1
##   4  40  77   9
##   5  82 141  11
##   6 100 196  15
##   7  87 135  19
nerd_table <- nerd_red %>%
  group_by(gender, married) %>%
  count(name = "number_of_people")
nerd_table %>%
  ungroup()
When looking at the data summary, we already noticed that some of the reported values for age are impossible. We want to filter those out - what would you consider to be a good cut-off value? Also, marital status is supposed to be coded on a scale from 1 - 3, but there are five people where the entry is 0. We want to exclude these guys as well.
# table the married column
table(nerd_red$married)
##
##   0   1   2   3
##   5 841 115  39
nerd_red %>%
  group_by(married) %>%
  count()
summary(nerd_red$age)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
##    13.00    17.00    20.00    63.29    27.00 38822.00
Note that there is no output when I assign something to a variable, so I have to call the variable nerd_red again in order to display it (see footnote 4).
# filter out people with implausible marriage status and age
# overwrite nerd_red
nerd_red <-
nerd_red %>%
  filter(married %in% 1:3,
         age %in% 0:100)
nerd_red %>%
  filter(married %in% 1:3,
         between(age, 0, 100))
2.4) After filtering, what is the maximum age?
max(nerd_red$age)
## [1] 90
nerd_red$age %>% max()
## [1] 90
# the braces stop the pipe from also inserting nerd_red as the first argument of max()
nerd_red %>% {max(.$age)}
## [1] 90
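A word of caution about the dot when piping a whole data frame into a function like max(). The following sketch uses a made-up data frame, toy, and assumes the magrittr package is installed. As far as I understand the magrittr rules, the piped object is still inserted as the first argument when the dot only appears inside a nested expression such as .$age, so the result can be the maximum of the whole data frame rather than of one column.

```r
library(magrittr)

# made-up data: the largest value overall is NOT in the age column
toy <- data.frame(age = c(20, 35), score = c(100, 250))

# the dot only appears inside .$age, so toy is still inserted as first argument:
# this is max(toy, toy$age), i.e. the maximum over the whole data frame
whole_max <- toy %>% max(.$age)

# braces switch off the automatic first argument: this is max(toy$age)
age_max <- toy %>% {max(.$age)}

whole_max
age_max
```

With the nerd data, both spellings happen to return the same number as long as no other column contains a value larger than the maximum age, which makes the difference easy to overlook.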
2.5) How many people of each gender are included in the data?
table(nerd_red$gender)
##
##   1   2   3
## 349 589  56
nerd_red %>%
  group_by(gender) %>%
  count()
nerd_red %>%
  count(gender)
2.6) How many people per marital status do we have?
nerd_red %>%
  count(married)
2.7) How many people did we exclude in total?
nerd %>% nrow() - nerd_red %>% nrow()
## [1] 6
nrow(nerd) - nerd_red %>% count()
nrow(nerd) - nrow(nerd_red)
## [1] 6
2.8) Check whether the nerdy column has only valid values, i.e. values from 1 - 7. What would your strategy be? Filter out invalid values if necessary.
nerd_red %>%
  count(nerdy)
nerd_red <-
  nerd_red %>%
  filter(nerdy %in% 1:7)
2.9) Check whether the voted column has only valid entries (1 or 2). Filter out invalid values if necessary.
nerd_red %>%
  count(voted)
nerd_red <-
  nerd_red %>%
  filter(voted %in% 1:2)
We know how to extract a vector from a data frame.
3.1) Quick recap: Save the age column into a single vector.
age <- nerd$age
But what if we modified our data frame with a pipe and want to quickly extract one column? Sure, we could save the modified data frame and then extract the column.
3.2) Save a data frame where you keep only the participants who are younger than 20. Then extract the age column.
nerd_young <- nerd_red %>%
  filter(age < 20)
age_young <- nerd_young$age
However, we can do all of this in one go using the pull() function.
# Filter the data frame for participants who are younger than 20 and extract the age
# column using pull()
nerd_red %>%
filter(age < 20) %>%
  pull(age)
## [1] 17 18 18 17 15 18 16 15 17 14 17 18 15 16 14 17 15 17 18 19 13 16 17 15 19
## [26] 18 18 14 16 14 17 14 13 16 15 14 18 14 14 15 17 17 19 14 18 19 17 16 15 16
## [51] 15 18 16 16 14 18 17 13 17 14 14 15 17 18 17 14 16 19 19 18 17 15 16 16 18
## [76] 15 18 17 19 17 16 17 18 18 18 16 17 14 19 14 18 16 17 18 15 18 13 17 16 17
## [101] 17 17 13 16 18 18 18 19 17 13 16 14 19 13 15 16 14 18 13 19 15 15 16 13 13
## [126] 18 19 15 17 17 18 18 17 19 16 18 16 17 19 19 18 18 18 16 13 15 16 18 18 18
## [151] 15 19 19 15 15 16 17 19 18 14 14 19 17 14 13 17 19 17 19 19 14 18 16 19 16
## [176] 17 19 15 14 18 16 16 17 14 16 19 18 14 18 15 14 15 16 16 17 15 17 17 14 16
## [201] 15 16 13 15 15 18 14 19 17 18 18 17 18 16 17 16 15 18 19 17 16 18 19 18 17
## [226] 17 16 19 19 18 17 17 19 17 16 15 13 18 17 18 18 18 18 18 16 17 19 13 17 18
## [251] 17 18 15 19 16 18 16 16 16 14 15 18 19 18 19 19 19 19 13 16 17 17 17 18 13
## [276] 17 15 16 18 17 16 18 16 15 18 16 14 18 16 13 15 16 16 16 16 16 16 16 18 19
## [301] 16 19 13 14 15 15 14 15 18 18 15 16 17 19 15 16 18 13 17 18 15 16 17 17 17
## [326] 14 16 17 19 16 17 15 16 15 19 18 18 18 17 16 15 18 15 18 15 16 16 16 15 15
## [351] 13 15 16 15 17 19 18 16 14 19 18 17 18 18 18 15 18 17 19 19 18 19 19 17 19
## [376] 16 19 17 19 16 14 16 19 16 17 18 18 14 18 18 13 16 14 17 17 16 17 18 13 13
## [401] 16 18 17 19 15 15 18 18 18 17 17 16 13 18 19 16 18 18 17 18 17 17 19 18 16
## [426] 16 15 17 18 16 15 14 18 17 17 15 15 17 16 18 19 16 16 19 14 13 17 16 14 18
## [451] 13 15 17
You might wonder why we don’t just use select() to do the job. Try using select for the same purpose. Can you see what the problem is?
# Filter the data frame for participants who are younger than 20 and extract the age
# column using select()
nerd_red %>%
filter(age < 20) %>%
  select(age)
Another thing that is very useful is arrange(), which lets you sort your data easily. For example, here is how you would sort the data by age:
# arrange nerd_red by age
nerd_red %>%
  arrange(age)
We can also arrange the data in descending order by wrapping the respective column in desc() (for “descending”).
# arrange nerd_red by age in descending order
nerd_red %>%
  arrange(desc(age))
Sorting with multiple columns is not a problem:
# arrange nerd_red by age and gender
nerd_red %>%
  arrange(age, gender)
4.1) Use the code above, but switch the order of gender and age. What happens?
nerd_red %>%
  arrange(gender, age)
4.2) Arrange nerd_red by voted and age. Let voted be ascending and age descending.
nerd_red %>%
  arrange(voted, desc(age))
We want to add some columns to the data. Most of the time, we want to calculate something like a sum score or re-code questions. The good news is that most of this follows the same logic as vector operations which we saw before.
In a rather silly example, we could calculate the birth year of each participant. We know that data has been collected from December 2015 - December 2018, but let’s assume for now that all this data is from 2018. So to get the participants’ birth year (roughly), we would need to calculate 2018 - participant age.
Whenever we want to add a new column (or modify one) within the tidyverse, we use the mutate() function. Here is how things would work for our toy example:
# create a column with the (approximate) birth year - add it before the gender column
nerd_red %>%
  mutate(birthyear = 2018 - age, .before = gender)
I used the .before argument so the new column would be added before the column gender. Otherwise, the new column would have been added at the end of my data frame. This argument is a little bit strange, because its name starts with a dot, but don’t worry about this.
Note that birthyear has not been saved to nerd_red right now:
# show names of nerd_red
names(nerd_red)
## [1] "age" "gender" "married" "voted" "nerdy" "Q1" "Q2"
## [8] "Q3" "Q4" "Q5" "Q6" "Q7" "Q8" "Q9"
## [15] "Q10" "Q11" "Q12" "Q13" "Q14" "Q15" "Q16"
## [22] "Q17" "Q18" "Q19" "Q20" "Q21" "Q22" "Q23"
## [29] "Q24" "Q25" "Q26"
In order to do that, we would explicitly need to overwrite nerd_red, which we won’t do here, because birthyear is a very useless column for us.
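The same behaviour can be seen in a tiny sketch. The data frame toy and the name toy_with_year are made up for this illustration, and the code assumes dplyr is installed (version 1.0.0 or later for the .before argument).

```r
library(dplyr)

# made-up data frame with a single column
toy <- data.frame(age = c(20, 31, 45))

# mutate() returns a NEW data frame; here the result is only printed
toy %>% mutate(birthyear = 2018 - age)

# toy itself is unchanged
names(toy)

# only an assignment keeps the new column
toy_with_year <- toy %>% mutate(birthyear = 2018 - age, .before = age)
names(toy_with_year)
```

This is why the examples that are meant to stick always start with an assignment such as nerd_red <- in front of the pipe.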
Likewise, we can use functions on columns. For example, we could turn the age column into a character. That doesn’t make sense in this example, but the opposite case is often true, where you want to turn a column that is accidentally coded as a character into a numeric column.
# create a column called age_character (the age as character) and put it after the age column
# overwrite nerd_red
nerd_red <-
  nerd_red %>%
  mutate(age_character = as.character(age), .after = age)
nerd_red
5.1) By overwriting nerd_red, I permanently added the age_character column to the data frame. Can you turn it into a numeric column again? You can replace it using the old column name (i.e. mutate(age_character = ...)). (Of course, the name will be pretty misleading then.)
nerd_red %>%
  mutate(
    age2 = age - 100, .before = gender,
    age = age + 1000
  )
nerd_red %>%
  mutate(age_character = as.numeric(age_character))
nerd_red %>%
  mutate(age_character = age, .before = gender)
5.2) According to the code book, gender has been coded as 1 = Male, 2 = Female, 3 = Other. Assume there has been a mistake and it’s actually 0 = Male, 1 = Female, 2 = Other. How can you correct the mistake? Don’t overwrite the data frame, because we want to keep the old (correct) gender column as it is.
nerd_red %>%
  mutate(gender = gender - 1)
Now we still have the useless age_character column in our data. If we want to get rid of it, we can do so by simply “deselecting” it:
# several columns can be deselected at once (the result is not saved here)
nerd_red %>%
  select(-c(age_character, gender, nerdy))
# deselect the age_character column
# overwrite nerd_red
nerd_red <- nerd_red %>%
  select(!age_character)
nerd_red
In a lot of cases, we want to calculate the sum of columns. Say we want to calculate the sum of Q1 and Q2 in a new column called sum_Q1_Q2. This is how we would do that:
# calculate the sum of Q1 and Q2 and save it in a column sum_Q1_Q2 - put it after Q2
nerd_red %>%
  mutate(sum_Q1_Q2 = Q1 + Q2, .after = Q2)
We can see how the columns Q1 and Q2 add up per row in our new column. But what if we want to calculate the sum score across all the columns Q1 - Q26? Writing all of them down with a little plus sign seems tedious.
We can achieve this with a few twists. Let’s consider the third line of the code chunk below first, the bit with the mutate() function. We can see that we create a column called nerd_score. We then use the function sum(), for obvious reasons. Next, we see the function across(). This is a so-called “helper function”: it is only used inside functions like mutate(), so that mutate() knows which columns to work on. Basically, we need across() every time mutate() needs to use the same function on multiple columns at once. (Adding two columns together is a special case.) It actually matches the verbal description I used above: I want to calculate the sum score across columns Q1 - Q26. So far, so good. However, one crucial piece of code that we need is the additional rowwise() function before using mutate(). It does exactly what the name suggests: It causes all following functions to be executed separately on every single row of a data frame. That is, we calculate the sum for every row (= participant) separately. This might seem slightly confusing because we just saw that adding columns with + automatically works row by row. Most functions work like that in R, but some don’t - one of them is sum(). It just sums up everything, so if we used it without rowwise(), we would get the total of all the elements in all the columns we selected. Note that sum() is not the same as +.
As a last step, we ungroup() the data, because rowwise() groups the data into rows and we don’t want it to stay like that (otherwise, it would). Note that I overwrote nerd_red with the version of nerd_red that now has a column with the nerd sum score.
# use rowwise ...
# to calculate the nerd score ...
# as the sum ...
# across ...
# Q1 - Q26
# ungroup at the end
nerd_red <- nerd_red %>%
  rowwise() %>%
  mutate(nerd_score = sum(across(Q1:Q26)), .before = Q1) %>%
  ungroup()
# base R way
nerd_red$nerd_score <-
  rowSums(nerd_red[ , paste0("Q", 1:26)])
paste0("Q", 1:5)
## [1] "Q1" "Q2" "Q3" "Q4" "Q5"
paste("Q", 1:5)
## [1] "Q 1" "Q 2" "Q 3" "Q 4" "Q 5"
paste0(c("A", "B"), 1:4)
## [1] "A1" "B2" "A3" "B4"
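To see why rowwise() matters for the sum score, here is a sketch on a made-up data frame with only two question columns. It assumes dplyr is installed. Note that I am swapping in c_across(), dplyr’s row-wise companion to across(), instead of the across() call used in the text; the idea is the same. All object names (toy, vectorised, collapsed, per_row) are invented for this illustration.

```r
library(dplyr)

# made-up answers of three participants to two questions
toy <- data.frame(Q1 = c(1, 2, 3), Q2 = c(10, 20, 30))

# + is vectorised: one sum per row
vectorised <- toy %>% mutate(total = Q1 + Q2)

# sum() collapses everything into ONE number, which is then repeated in every row
collapsed <- toy %>% mutate(total = sum(Q1, Q2))

# with rowwise(), sum() is run separately for every row
per_row <- toy %>%
  rowwise() %>%
  mutate(total = sum(c_across(Q1:Q2))) %>%
  ungroup()

vectorised$total
collapsed$total
per_row$total

# base R equivalent of the row-wise version
rowSums(toy[, c("Q1", "Q2")])
```

The collapsed version is the trap described above: it runs without any complaint, but every participant ends up with the grand total instead of their own score.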
Sometimes, you want to modify existing columns in your data. For example, we could turn the numbers in the gender column into text, so it’s easier to interpret. We can see in the code book that 1 = Male, 2 = Female, 3 = Other. This is how we would recode the gender column:
# use case_when() to recode the gender column
# overwrite nerd_red
nerd_red <-
  nerd_red %>%
  mutate(
    gender = case_when(
      gender == 1 ~ "male",
      gender == 2 ~ "female",
      gender == 3 ~ "other"
    )
  )
nerd_red
nerd_red %>%
  mutate(
    married = case_when(
      married == 1 ~ married,
      married == 2 ~ as.integer(999),
      TRUE ~ as.integer(12345)
    )
  )
Let’s unpack this: case_when(), as the name suggests, does stuff when a certain case occurs. Whenever the gender column is 1, we insert “male”, when it is 2, we put in “female”, and when it is 3, we put in “other”. Note that it is important to specify all possible cases. If we only had put in values for the cases 1 and 2 (so, male and female), every occurrence of 3 would have been left blank (NA).
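Here is a small sketch of the missing-case behaviour, using a made-up vector gender_codes instead of the nerd data. It assumes dplyr is installed. The names incomplete and with_fallback are invented for this example.

```r
library(dplyr)

# made-up gender codes, including one 3
gender_codes <- c(1, 2, 3, 2)

# only two cases are specified: the 3 matches nothing and becomes NA
incomplete <- case_when(
  gender_codes == 1 ~ "male",
  gender_codes == 2 ~ "female"
)
incomplete

# TRUE as the last condition catches everything that was not matched before
with_fallback <- case_when(
  gender_codes == 1 ~ "male",
  gender_codes == 2 ~ "female",
  TRUE ~ "other"
)
with_fallback
```

A catch-all line at the end is convenient, but it also hides unexpected values (a stray 0, for example), so spelling out every valid case can be the safer choice when you are still cleaning the data.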
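When we have only two cases, the function ifelse() is a useful shortcut. It takes three arguments: a logical test, the value to use where the test is TRUE, and the value to use where the test is FALSE.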
For example, if we want to classify these numbers into “one digit” or “more than one digit”, it looks like this:
some_numbers <- c(2, 24, 12, 5, 1, 353, 1)
# use ifelse to put "more" when some_numbers has more than one digit, and "one" if it's one digit
ifelse(some_numbers < 10, "one digit", "more than one digit")
## [1] "one digit"           "more than one digit" "more than one digit"
## [4] "one digit"           "one digit"           "more than one digit"
## [7] "one digit"
We can also modify the data based on this. For example, we could multiply the data with -1 whenever it has more than one digit, and leave it as it is when it is one digit.
# multiply some_numbers with -1 whenever it has more than one digit, and leave it as it is
# when it is one digit.
# Question: What does some_numbers contain now?
ifelse(some_numbers < 10, some_numbers, some_numbers * -1)
## [1]    2  -24  -12    5    1 -353    1
Note how we have to call some_numbers, even if we’re leaving things unchanged. You can imagine it like this: Suppose you’re creating a new vector. For every position of that vector, you contemplate whether you should put in the old number from some_numbers, or whether you should multiply it with -1 before adding it.
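The mental picture of filling a new vector position by position can be written out as a loop. This is only a sketch to illustrate the idea, not how ifelse() is implemented internally; the names by_hand and vectorised are made up.

```r
some_numbers <- c(2, 24, 12, 5, 1, 353, 1)

# build the new vector "by hand", one position at a time
by_hand <- numeric(length(some_numbers))
for (i in seq_along(some_numbers)) {
  if (some_numbers[i] < 10) {
    by_hand[i] <- some_numbers[i]        # one digit: keep the old number
  } else {
    by_hand[i] <- some_numbers[i] * -1   # more than one digit: flip the sign
  }
}

# ifelse() does the same thing in a single call
vectorised <- ifelse(some_numbers < 10, some_numbers, some_numbers * -1)

all(by_hand == vectorised)
```

In both versions, some_numbers itself stays untouched, because the result is a new vector.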
6.1) Recode the married column. It’s 1 = Never married, 2 = Currently married, 3 = Previously married. Overwrite nerd_red with the version that contains the recoded column.
nerd_red <-
  nerd_red %>%
  mutate(
    married = case_when(
      married == 1 ~ "never married",
      married == 2 ~ "currently married",
      married == 3 ~ "previously married"
    )
  )
6.2) Recode the voted column. It’s 1 = Yes, 2 = No. Use ifelse(). Overwrite nerd_red with the version that contains the recoded column.
nerd_red <-
  nerd_red %>%
  mutate(voted = ifelse(voted == 1, "yes", "no"))
6.3) Like ifelse(), case_when() can be used outside of mutate(). Can you use case_when() with the following vector so that you 1) add 20 when the number is below 0, 2) subtract 2 when the number is above 0, and 3) put in 9999 when the number is exactly 0?
some_more_numbers <- c(12, 0, 0, -4, 0, -12, 5, 2)
case_when(
  some_more_numbers < 0 ~ some_more_numbers + 20,
  some_more_numbers > 0 ~ some_more_numbers - 2,
  some_more_numbers == 0 ~ 9999
)
## [1]   10 9999 9999   16 9999    8    3    0
One last thing. We see that every participant is a row in our data frame, but there is no participant id. This is in principle not a big issue for this data set, but for some data sets, it’s really important that everything stays in the correct order. Some operations on the data may change its order, and then it is crucial to be able to recover the original order. We can add an id column to our data by numbering the rows.
The tidyverse has a built-in function for this, which goes like:
# use rowid_to_column() to create a new id column named id (var = "id")
nerd_red <-
nerd_red %>%
  rowid_to_column(var = "id")
You could also number the rows like this:
# use nrow to create the same id column (add before age)
nerd_red %>%
  mutate(id = 1:nrow(.), .before = age)
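To show what the id column buys us, here is a sketch with a made-up data frame: we add an id, shuffle the rows by sorting, and then use the id to get the original order back. It assumes dplyr and tibble are installed (both are part of the tidyverse). The names toy, toy_id, shuffled and restored are invented for this example.

```r
library(dplyr)

# made-up data in its "original" order
toy <- data.frame(age = c(31, 20, 45))

# number the rows before doing anything else
toy_id <- toy %>% tibble::rowid_to_column(var = "id")

# an operation that changes the row order
shuffled <- toy_id %>% arrange(age)
shuffled$id

# sorting by id recovers the original order
restored <- shuffled %>% arrange(id)
restored$age
```

Without the id column, there would be no way to tell from shuffled alone which row originally came first.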
Footnote 1: Notice that I’m not using quotation marks here. However, both work: library(tidyverse) and library("tidyverse") do the same job.
Footnote 2: This thing is a pain in the a** to type by hand. I strongly recommend creating a shortcut. You can do this via Tools → Modify Keyboard Shortcuts. Look for “insert pipe operator” in the search bar. I chose “alt + .” to add the pipe, simply because “alt + -” inserts the assignment operator <- by default.
Footnote 3: In fact, a brand new pipe operator has just been added to base R, which looks like this: |>. It works much like the tidyverse pipe, but not quite. You can’t always exchange one for the other, at least not at the time of writing. The base R pipe has been officially available since 18.05.2021 (version 4.1.0), so it will probably see some development over the next few years.
Footnote 4: The alternative is to wrap the code into (). Then, the output is printed even if I assign something to a variable. See the difference between example <- 2 + 3 and (example <- 2 + 3).